Search results for "hyperparameter optimization"

Showing 5 of 5 documents

Adjusted bat algorithm for tuning of support vector machine parameters

2016

Support vector machines are a powerful and widely used supervised learning technique for classification. The quality of the constructed classifier can be improved by appropriate selection of the learning parameters. These parameters are often tuned using grid search with a relatively large step. This optimization process can be carried out more efficiently and more precisely using stochastic search metaheuristics. In this paper we propose an adjusted bat algorithm for support vector machine parameter optimization and show that, compared to grid search, it leads to a better classifier. We tested our approach on a standard set of benchmark data sets from the UCI machine learning repositor…
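As an illustration of the grid-search baseline this abstract compares against, here is a minimal Python sketch that tunes the C and gamma hyperparameters of an RBF-kernel SVM with scikit-learn. The dataset, grid values, and cross-validation setup are placeholders, and the paper's adjusted bat algorithm itself is not reproduced here.

```python
# Minimal sketch of the grid-search baseline (not the adjusted bat algorithm).
# Assumes an RBF-kernel SVC and a small UCI-style dataset from scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Coarse logarithmic grid over the two RBF-SVM hyperparameters C and gamma.
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [1e-3, 1e-2, 1e-1, 1],
}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```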

Keywords: Wake-sleep algorithm; Active learning (machine learning); Computer science; Stability (learning theory); Linear classifier; Semi-supervised learning; Cross-validation; Relevance vector machine; Kernel (linear algebra); Least squares support vector machine; Metaheuristic; Bat algorithm; Structured support vector machine; Supervised learning; Online machine learning; Particle swarm optimization; Pattern recognition; Perceptron; Generalization error; Support vector machine; Kernel method; Computational learning theory; Margin classifier; Hyperparameter optimization; Data mining; Artificial intelligence; Hyper-heuristic
Published in: 2016 IEEE Congress on Evolutionary Computation (CEC)

An LP-based hyperparameter optimization model for language modeling

2018

To find hyperparameters for a machine learning model, algorithms such as grid search or random search are applied over the space of possible values of the model's hyperparameters. These search algorithms select the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find per…
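Since the abstract hinges on approximating the model by a linear program solved with the simplex method, a minimal sketch of solving a small LP with SciPy's linprog is shown below. The objective and constraints are placeholders, not the paper's perplexity model; SciPy's default HiGHS backend (which includes a simplex-type solver) is assumed.

```python
# Minimal sketch of solving a small linear program with SciPy; the paper's
# perplexity-based model is not reproduced -- the objective and constraints
# below are placeholders for illustration only.
import numpy as np
from scipy.optimize import linprog

# minimize c^T x  subject to  A_ub x <= b_ub,  x >= 0
c = np.array([1.0, 2.0])           # placeholder objective coefficients
A_ub = np.array([[-1.0, -1.0],     # -x1 - x2 <= -1  (i.e. x1 + x2 >= 1)
                 [ 1.0, -2.0]])    #  x1 - 2*x2 <= 4
b_ub = np.array([-1.0, 4.0])

# The default HiGHS backend includes a simplex-type solver.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, res.fun)
```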

Keywords: Mathematical optimization; Perplexity; Linear programming; Computer science; Nonlinear programming; Random search; Simplex algorithm; Search algorithm; Hyperparameter; Hyperparameter optimization; Language model; Theoretical Computer Science; Hardware and Architecture; Software; Information Systems
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML); Optimization and Control (math.OC); Computation and Language

A heuristic, iterative algorithm for change-point detection in abrupt change models

2017

Change-point detection in abrupt change models is a very challenging research topic in many fields of both methodological and applied statistics. Due to strong irregularities, discontinuity, and non-smoothness, likelihood-based procedures are awkward; for instance, the usual optimization methods do not work, and grid search algorithms are the most widely used approach for estimation. In this paper, a heuristic, iterative algorithm for approximate maximum likelihood estimation is introduced for change-point detection in piecewise constant regression models. The algorithm is based on iterative fitting of simple linear models, and appears to extend easily to more general frameworks, such as models i…
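For context, here is a minimal sketch of the grid-search estimator the abstract refers to as the usual approach: a single change point in a piecewise constant model is found by scanning candidate breaks and minimizing the residual sum of squares. The simulated data and candidate range are placeholders, and the paper's heuristic iterative algorithm is not reproduced.

```python
# Minimal sketch of single change-point estimation by grid search in a
# piecewise constant model (illustrative only; not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
n, true_cp = 200, 120
y = np.concatenate([rng.normal(0.0, 1.0, true_cp),
                    rng.normal(2.0, 1.0, n - true_cp)])

def rss_at(y, k):
    """Residual sum of squares with a constant mean before and after index k."""
    left, right = y[:k], y[k:]
    return ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()

candidates = range(5, n - 5)              # keep a few points on each side
cp_hat = min(candidates, key=lambda k: rss_at(y, k))
print("estimated change point:", cp_hat)
```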

Keywords: Statistics and Probability; Mathematical optimization; Iterative method; Heuristic (computer science); Linear model; Piecewise constant model; Approximate maximum likelihood; Model linearization; Grid search limitations; Computational Mathematics; Discontinuity; Hyperparameter optimization; Covariate; Piecewise; Statistics, Probability and Uncertainty; Settore SECS-S/01 - Statistica; Change detection; Mathematics

Online Hyperparameter Search Interleaved with Proximal Parameter Updates

2021

There is a clear need for efficient hyperparameter optimization (HO) algorithms for statistical learning, since commonly applied search methods (such as grid search with N-fold cross-validation) are inefficient and/or approximate. Previously existing gradient-based HO algorithms that rely on the smoothness of the cost function cannot be applied to problems such as Lasso regression. In this contribution, we develop an HO method that relies on the structure of proximal gradient methods and does not require a smooth cost function. The method is applied to leave-one-out (LOO)-validated Lasso and Group Lasso, and an online variant is proposed. Numerical experiments corroborate the convergence …
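To make the baseline concrete, below is a minimal sketch of grid search over the Lasso regularization weight with leave-one-out validation, i.e. the inefficient approach the abstract contrasts with, assuming scikit-learn and a synthetic regression problem. The proximal hypergradient method itself is not reproduced.

```python
# Minimal sketch of the grid-search + LOO baseline (not the proximal HO method).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, LeaveOneOut

X, y = make_regression(n_samples=40, n_features=20, noise=1.0, random_state=0)

alphas = np.logspace(-3, 1, 20)            # candidate hyperparameter values
search = GridSearchCV(Lasso(max_iter=10000),
                      {"alpha": alphas},
                      cv=LeaveOneOut(),
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print("selected alpha:", search.best_params_["alpha"])
```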

Keywords: Hyperparameter; Computer science; Stability (learning theory); Approximation algorithm; Stationary point; Lasso (statistics); Hyperparameter optimization; Proximal gradient methods; Online algorithm; Algorithm
Published in: 2020 28th European Signal Processing Conference (EUSIPCO)

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

2022

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approxim…
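As a point of reference, the bilevel formulation the abstract starts from can be sketched with a zero-order (grid) outer loop: the inner problem fits a Lasso for each candidate regularization weight and the outer problem scores it on a held-out split. The data, split, and grid below are placeholders, and the paper's implicit-differentiation hypergradients are not reproduced.

```python
# Minimal sketch of the bilevel view with a zero-order (grid) outer loop;
# illustrative only, not the paper's first-order hypergradient method.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=50, noise=1.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def outer_objective(alpha):
    """Validation loss of the inner Lasso solution for a given alpha."""
    inner = Lasso(alpha=alpha, max_iter=10000).fit(X_tr, y_tr)   # inner problem
    return mean_squared_error(y_val, inner.predict(X_val))       # outer criterion

alphas = np.logspace(-3, 1, 30)
best = min(alphas, key=outer_objective)
print("zero-order estimate of the optimal alpha:", best)
```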

Keywords: bilevel optimization; hyperparameter selection; generalized linear models; convex optimization; hyperparameter optimization; Lasso
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML); Optimization and Control (math.OC); Statistics (math.ST)